We propose an efficient solution to the problem of sparse linear prediction analysis of the speech signal. Our method is based on the minimization of a weighted l2-norm of the prediction error. The weighting function is constructed so that less emphasis is placed on the error around the points where we expect the largest prediction errors to occur (the glottal closure instants); hence, the resulting cost function approaches the ideal l0-norm cost function for sparse residual recovery. We show that efficient minimization of this objective function (by solving the normal equations of a linear least-squares problem) provides an enhanced sparsity level of the residuals compared to the l1-norm minimization approach, which relies on computationally demanding convex optimization methods. Indeed, the computational complexity of the proposed method is roughly the same as that of classic minimum-variance linear prediction analysis. Moreover, to show a potential application of such a sparse representation, we use the resulting linear prediction coefficients inside a multi-pulse synthesizer and show that the corresponding multi-pulse estimate of the excitation source yields slightly better synthesis quality than the classical technique, which uses the traditional non-sparse minimum-variance synthesizer.
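As a minimal illustrative sketch (not the authors' implementation), the weighted l2 criterion described above can be minimized in closed form by solving the normal equations of a weighted least-squares problem, (X^T W^2 X) a = X^T W^2 y. The function name `weighted_lp` and the weight vector `w` are assumptions for illustration; in the paper's setting, `w` would be small near the expected glottal closure instants so that large residuals there are penalized less.

```python
import numpy as np

def weighted_lp(x, order, w):
    """Weighted linear prediction sketch: minimize the weighted l2-norm of
    the prediction error by solving the normal equations directly.

    x     : 1-D signal frame
    order : prediction order p
    w     : per-sample weights (chosen small where large prediction
            errors are expected, e.g. near glottal closure instants)
    """
    n = len(x)
    # Rows of X hold the p past samples so that X @ a predicts y = x[order:].
    X = np.column_stack(
        [x[order - k - 1 : n - k - 1] for k in range(order)]
    )
    y = x[order:]
    w2 = w[order:] ** 2  # squared weight applied to each error term

    # Normal equations of the weighted least-squares problem:
    # (X^T W^2 X) a = X^T W^2 y
    A = X.T @ (w2[:, None] * X)
    b = X.T @ (w2 * y)
    a = np.linalg.solve(A, b)

    residual = y - X @ a  # prediction error (sparse by construction)
    return a, residual
```

With uniform weights this reduces to the classic minimum-variance linear prediction solution; the cost is dominated by forming and solving a p-by-p system, which matches the complexity claim in the abstract.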